In this paper we examine the effects of an algorithm for syntactic 
analysis on word recognition accuracy. The behavior of the algorithm 
is studied by means of a computer simulation. We describe the syntactic 
analysis technique, the problem domain to which it was applied, and 
the details of the simulation. We then present the results of the simu- 
lation and their implications. We find, for example, that an acoustic 
word error rate of 10 percent is reduced to 0.2 percent after syntactic 
analysis, resulting in a sentence error rate of 1 percent. These figures 
are based on a 127-word vocabulary and an average of 10.3 words per 
sentence for 1000 sentences. We expect that these results are indicative 
of the performance which will be attained by a real speech recognition 
system which uses the syntactic analysis algorithm described 
herein. 

I. INTRODUCTION 

The utility and flexibility of a speech recognition system can be sub- 
stantially expanded if it can accept sentence length utterances rather 
than single words. Simultaneously, accuracy can be greatly improved 
by exploiting the grammatical constraints of language on the input 
sentences. 1,2,3

The purpose of this investigation is to establish, by means of a com- 
puter simulation, how much improvement in reliability can be obtained 
by using a particular optimal method for syntactic analysis in conjunc- 
tion with an isolated word recognition system. 

This paper is in five sections. First we give a description of the method 
of syntax analysis under consideration. In the second section we describe 
the content and the semantic and grammatical structure of the problem 
domain to which we intend to apply the analysis. The third section is 
devoted to a description of the simulation, particularly of the acoustic 
recognizer and the procedure for generating random sentences. In the
fourth section we present the results of the simulation. We conclude with
an evaluation of the results and a brief discussion of directions for further
investigations.

Fig. 1. Block diagram of the speech recognition system.

It should be said at the outset that the results of this experiment are 
encouraging. We find that, by using the grammatical constraints for our 
problem domain, the syntax analyzer reduces the acoustic error rate from 
10 percent to 0.2 percent. This results in a sentence error rate of 1 per- 
cent. The sentences were composed from a 127-word vocabulary and 
contained an average of 10.3 words per sentence over 1000 randomly 
generated sentences. 

We believe that these figures, with the more detailed results given in
Section V, are indicative of those which will be attained when the
method of syntax analysis described herein is incorporated into a real 
speech recognition system. 

II. AN OPTIMAL ALGORITHM FOR SYNTAX ANALYSIS 

The type of speech recognition system we are evaluating is shown in 
Fig. 1, and its operation may be formally described as follows. 

Let the language, L, be the subset of English used in a particular
speech recognition task. Sentences in L are composed from the vocabulary,
V, consisting of the M words $v_1, v_2, \ldots, v_M$. Let W be an arbitrary
sentence in the language. Then we write $W \in L$ and

$$W = w_1 w_2 \cdots w_k \tag{1}$$

where each $w_i$ is a vocabulary word, which we signify by writing $w_i \in V$
for $1 \le i \le k$. Clearly W contains k words, and we will often denote this
by writing $|W| = k$. Similarly the number of sentences in L will be denoted
by $|L|$.

The sentence W of eq. (1) is encoded in the speech signal x(t) and
input to the acoustic recognizer, from which is obtained the probably
corrupted string

$$\tilde{W} = \tilde{w}_1 \tilde{w}_2 \cdots \tilde{w}_k \tag{2}$$

where $\tilde{w}_i \in V$ for $1 \le i \le k$ but $\tilde{W}$ is not, in general, a sentence in L.

The acoustic recognizer also produces the matrix $[d_{ij}]$ whose ijth entry,
$d_{ij}$, is the distance, as measured by some metric in an appropriate pattern
space, from the ith word, $\tilde{w}_i$, to the prototype for the jth vocabulary word,
$v_j$, for $1 \le i \le k$ and $1 \le j \le M$.

Fig. 2. Example state transition diagram.

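The syntax analyzer then produces the string

$$\hat{W} = \hat{w}_1 \hat{w}_2 \cdots \hat{w}_k \tag{3}$$

for which the total distance, $D(\hat{W})$, given by

$$D(\hat{W}) = \sum_{i=1}^{k} d_{i j_i}, \qquad 1 \le j_i \le M \tag{4}$$

where $j_i$ is the index of the vocabulary word chosen for position i (that is,
$\hat{w}_i = v_{j_i}$), is minimized subject to the constraint that $\hat{W} \in L$. Thus the syntax
analysis is optimal in the sense of minimum distance.

Since, in general, $\tilde{W} \notin L$, whereas W was assumed to be grammatically
well-formed (i.e., $W \in L$), the process should correct word recognition
errors.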

In principle one could minimize the objective function of eq. (4) by
computing $D(\hat{W})$ for every $\hat{W} \in L$ and choosing the smallest value. In practice,
when $|L|$ is large, this is impossible. One must perform the optimization
efficiently. It has been shown by Lipton and Snyder 4 that for a particular
class of languages one can minimize $D(\hat{W})$ in time proportional to $|\tilde{W}|$.
In fact one can optimize any reasonable objective function in time linear
in the length of the input.

The particular class of languages for which the efficiency can be attained
is called the class of Regular languages. For the purposes of this
discussion we shall define the class of Regular languages as that class for
which each member language can be represented by an abstract graph
called a state transition diagram.

A state transition diagram consists of a finite set of vertices or states,
Q, and a set of edges or transitions connecting the states. Each such edge
is labeled with some $v_i \in V$. The exact manner of the interconnection
of states is specified symbolically by a transition function, $\delta$, where

$$\delta : (Q \times V) \to Q \tag{5}$$

That is, if a state $q_i \in Q$ is connected to another state $q_j \in Q$ by an edge
labeled $v_m \in V$ then

$$\delta(q_i, v_m) = q_j \tag{6}$$

We also define a set of accepting states, $Z \subset Q$, which has the significance
that a string $W = w_1 w_2 \cdots w_k$, where $w_i \in V$ for $1 \le i \le k$, is a well-formed
sentence in the language, L, represented by the state transition
diagram if and only if there is a path starting at $q_1$ and terminating in
some $q_j \in Z$ whose edges are labeled, in order, $w_1, w_2, \ldots, w_k$.
Alternatively we may write $W \in L$ iff

$$\begin{aligned}
\delta(q_1, w_1) &= q_{j_1} \\
\delta(q_{j_1}, w_2) &= q_{j_2} \\
&\;\;\vdots \\
\delta(q_{j_{k-1}}, w_k) &= q_{j_k} \in Z
\end{aligned} \tag{7}$$

We may then define the language, L, as the set of all W satisfying eq. 
(7). An example of these concepts is shown in Fig. 2. The accepting states 
are marked by asterisks. 
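
The acceptance test of eq. (7) can be sketched in a few lines of Python. This is only an illustration: the dictionary DELTA holds the five transitions implied by the worked example later in this section (state sequence 1, 8, 9, 10, 11, 6 for HOW MUCH IS THE FARE, with state 6 accepting), the rest of Fig. 2 is omitted, and the names DELTA, START, ACCEPTING and accepts are assumptions made for the sketch.

```python
# Fragment of a state transition diagram, stored as delta[(state, word)] = next state.
# Only the path used in the worked example is included; the rest of Fig. 2 is omitted.
DELTA = {
    (1, "HOW"): 8,
    (8, "MUCH"): 9,
    (9, "IS"): 10,
    (10, "THE"): 11,
    (11, "FARE"): 6,
}
START = 1          # plays the role of q_1
ACCEPTING = {6}    # plays the role of Z (restricted to this fragment)


def accepts(words, delta=DELTA, start=START, accepting=ACCEPTING):
    """Return True iff the words label a path from the start state into Z, as in eq. (7)."""
    q = start
    for w in words:
        if (q, w) not in delta:   # no edge with this label leaves q
            return False
        q = delta[(q, w)]
    return q in accepting


print(accepts("HOW MUCH IS THE FARE".split()))
print(accepts("WOULD SOME IS TO FARE".split()))
```

The first call follows the path to state 6 and succeeds; the second fails at the first word because no edge labeled WOULD leaves state 1 in this fragment.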

While the definition given above of a Regular language is mathema- 
tically rigorous, it is not the standard one used in the literature on formal 
language theory but rather has been specifically tailored to the notational 
requirements of this paper. The interested reader is urged to refer to 
Hopcroft and Ullman 10 for a standard and complete introduction to 
formal language theory. 

In the following discussion we shall restrict ourselves to finite Regular
languages, i.e., those for which $|L|$ is finite. This restriction in no way
alters the theory but its practical importance will become obvious in what
follows. The finiteness of the language implies that its state transition
diagram has no circuits, i.e., no paths of any length starting and ending
at the same state. Thus there is some maximum sentence length which
we shall denote $l_{\max}$.
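
Because the diagram of a finite language has no circuits, both $|L|$ and $l_{\max}$ can be obtained by a simple recursion over the states. The sketch below is not a procedure described in this paper; it is one way to compute the two quantities, shown on a small fragment whose transitions are taken from the worked example of this section. The function names are hypothetical, and the recursion terminates only when the diagram is acyclic.

```python
from functools import lru_cache

# Illustrative fragment only; the edges out of state 3 are omitted, so the
# I WANT / I NEED branch is a dead end here and contributes no sentences.
DELTA = {
    (1, "HOW"): 8, (8, "MUCH"): 9, (9, "IS"): 10,
    (10, "THE"): 11, (11, "FARE"): 6,
    (1, "I"): 2, (2, "WANT"): 3, (2, "NEED"): 3,
}
START = 1
ACCEPTING = frozenset({6})


def successors(q):
    """Next states reachable from q, one entry per outgoing edge."""
    return [nxt for (p, _word), nxt in DELTA.items() if p == q]


@lru_cache(maxsize=None)
def count_sentences(q):
    """Number of distinct accepting paths starting at q (|L| when q is the start state)."""
    total = 1 if q in ACCEPTING else 0
    for nxt in successors(q):
        total += count_sentences(nxt)
    return total


@lru_cache(maxsize=None)
def longest(q):
    """Length of the longest accepting path from q, or -1 if Z cannot be reached."""
    best = 0 if q in ACCEPTING else -1
    for nxt in successors(q):
        tail = longest(nxt)
        if tail >= 0:
            best = max(best, tail + 1)
    return best


print(count_sentences(START), longest(START))
```

For this fragment the only sentence is HOW MUCH IS THE FARE, so the sketch reports one sentence of maximum length five.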

We now turn to the problem of efficiently solving the minimization
problem of eq. (4). To do this we shall define two data structures, $\Phi$ and
$\Psi$, which will be used to store the estimates of $D(\hat{W})$ and $\hat{W}$, respectively.

The first stage of the algorithm is the initialization procedure in which
we set

$$\Phi_i(q) = \begin{cases} 0 & \text{for } q = q_1 \text{ and } i = 0 \\ \infty & \text{otherwise} \end{cases}$$

$$\Psi_i(q) = \lambda \qquad 0 \le i \le |\tilde{W}| = k; \;\; \forall q \in Q \tag{8}$$

where $\lambda$ denotes the null string.

The data structures have two indices. The subscript is the position of
the word in the sentence and the argument in parentheses refers to the
state so that the storage required for each array is, at most, the product
of $l_{\max} + 1$ and $|Q|$, the number of states in the set Q.

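After initialization we utilize a dynamic programming technique defined
by the following recursion relations:

$$\Phi_i(q) = \min_{\Delta} \{ \Phi_{i-1}(q_p) + d_{ij} \} \tag{9}$$

where the set $\Delta$ is given by:

$$\Delta = \{ \delta(q_p, v_j) = q \} \tag{10}$$

that is, the minimum is taken over all pairs of a predecessor state $q_p$ and
a word $v_j$ whose transition leads to q. Then

$$\Psi_i(q) = \Psi_{i-1}(q_p)\, \hat{w}_i \tag{11}$$

where $\hat{w}_i$ is just the $v_j$ which minimizes $\Phi_i(q)$. Equation (11) is understood
to mean that the word $\hat{w}_i$ is simply appended to the string
$\Psi_{i-1}(q_p)$.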

Unfortunately the concatenation operation is not easily implemented
on general purpose computers so we change the recursion of eq. (11) by
making $\Psi$ into a linked list structure of the form:

$$\begin{aligned} \Psi_{1i}(q) &= q_p \\ \Psi_{2i}(q) &= \hat{w}_i \end{aligned} \tag{12}$$

Then when i = k we can trace back through the linked list of eq. (12) and
construct the sentence $\hat{W}$ as follows: First find $q_f \in Z$ such that

$$\Phi_k(q_f) = \min_{q \in Z} \{ \Phi_k(q) \} \tag{13}$$

set $q = q_f$ and then for $i = k, k-1, k-2, \ldots, 1$

$$\begin{aligned} \hat{w}_i &= \Psi_{2i}(q) \\ q &= \Psi_{1i}(q) \end{aligned} \tag{14}$$

Table I. Example $[d_{ij}]$ matrix

Code  Vocabulary word   i = 1  i = 2  i = 3  i = 4  i = 5
   1  Is                    9      9      1      8      4
   2  Fare                  2      2      5      3      1
   3  I                     7      3      2      3      2
   4  Want                  2      9      4      7      3
   5  Would                 1      5      4      8      2
   6  Like                  2      5      2      6      5
   7  Some                  2      1      9      8      3
   8  Information           7      7      4      7      8
   9  Please                2      3      2      4      9
  10  To                    4      5      8      1      7
  11  Make                  6      3      9      8      5
  12  A                     4      7      6      9      8
  13  Reservation           3      6      7      8      9
  14  Return                9      7      6      4      8
  15  The                   8      6      5      2      3
  16  Morning               3      4      5      6      7
  17  First                 8      6      8      7      5
  18  Class                 5      5      4      3      9
  19  Seat                  9      9      8      7      3
  20  Non-stop              3      3      4      5      8
  21  Flight                9      8      3      5      6
  22  Will                  6      7      7      6      6
  23  Pay                   4      4      4      4      3
  24  In                    3      3      3      6      9
  25  Cash                  5      4      3      7      6
  26  How                   2      9      8      7      5
  27  Much                  6      2      8      4      9
  28  Need                  7      6      5      4      3

Thus the sentence $\hat{W}$ is computed from right to left.
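
The recursion of eqs. (9) and (10), the linked list of eq. (12), and the traceback of eqs. (13) and (14) can be sketched as follows. This is a minimal illustration rather than the implementation used in the study: the distances are the Table I rows for eight words, the transitions are only those named in the worked example of this section (the rest of Fig. 2 is omitted), and the function name analyze and the dictionary layout are assumptions. Only the previous column of $\Phi$ is kept, while the back pointers are kept for every stage.

```python
import math

# d[word][i-1] is the distance of the i-th spoken word to that vocabulary word
# (rows copied from Table I, positions i = 1..5).
D = {
    "I":    [7, 3, 2, 3, 2],
    "WANT": [2, 9, 4, 7, 3],
    "NEED": [7, 6, 5, 4, 3],
    "HOW":  [2, 9, 8, 7, 5],
    "MUCH": [6, 2, 8, 4, 9],
    "IS":   [9, 9, 1, 8, 4],
    "THE":  [8, 6, 5, 2, 3],
    "FARE": [2, 2, 5, 3, 1],
}
# Partial transition function: only edges mentioned in the worked example.
DELTA = {
    (1, "I"): 2, (2, "WANT"): 3, (2, "NEED"): 3,
    (1, "HOW"): 8, (8, "MUCH"): 9, (9, "IS"): 10,
    (10, "THE"): 11, (11, "FARE"): 6,
}
START, ACCEPTING = 1, {6}


def analyze(d, k, delta, start, accepting):
    """Return (words, total distance) of the minimum-distance sentence of length k."""
    states = {start} | {p for p, _w in delta} | set(delta.values())
    phi = {q: math.inf for q in states}       # Phi_0, eq. (8)
    phi[start] = 0
    back = []                                 # back[i-1][q] = (q_p, word), eq. (12)
    for i in range(1, k + 1):
        new_phi = {q: math.inf for q in states}
        links = {}
        for (qp, word), q in delta.items():   # every transition ending in q, eq. (10)
            cost = phi[qp] + d[word][i - 1]   # candidate for eq. (9)
            if cost < new_phi[q]:
                new_phi[q] = cost
                links[q] = (qp, word)
        phi = new_phi                         # only the previous column is needed
        back.append(links)
    q = min(accepting, key=lambda s: phi[s])  # eq. (13)
    if math.isinf(phi[q]):
        return None, math.inf                 # no sentence of length k is accepted
    total, words = phi[q], []
    for i in range(k, 0, -1):                 # eq. (14): right to left
        q, word = back[i - 1][q]
        words.append(word)
    words.reverse()
    return words, total


print(analyze(D, 5, DELTA, START, ACCEPTING))
```

On this fragment the sketch recovers HOW MUCH IS THE FARE with a total distance of 8, the value quoted for $\Phi_5(6)$ in the text, and the intermediate value 13 for the NEED transition into state 3 also appears at the second stage.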

The operation of the above algorithm is illustrated in Tables I, II, and
III. Table I shows the vocabulary words of the language diagrammed in
Fig. 2 along with numerical codes and a sample $[d_{ij}]$ matrix. Table II
shows the details of the operation of the algorithm for i = 0,1,2. Table
III shows the results after the sentence has been completely analyzed.

By locating the smallest entry in each column of the sample $[d_{ij}]$
matrix of Table I, it can be seen that the acoustic transcription of the
sentence from which this matrix was produced is: WOULD SOME IS TO
FARE. Clearly this is not a valid English sentence nor is there any path
through the state transition diagram of Fig. 2 whose edges are so labeled.
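
The column-by-column reading of Table I can be checked mechanically. The sketch below simply takes, for each position, the vocabulary word with the smallest distance; the list TABLE_I reproduces the rows of Table I, and the function name transcribe is an assumption made for the illustration.

```python
# Rows of Table I: (vocabulary word, distances for positions i = 1..5).
TABLE_I = [
    ("IS", [9, 9, 1, 8, 4]), ("FARE", [2, 2, 5, 3, 1]), ("I", [7, 3, 2, 3, 2]),
    ("WANT", [2, 9, 4, 7, 3]), ("WOULD", [1, 5, 4, 8, 2]), ("LIKE", [2, 5, 2, 6, 5]),
    ("SOME", [2, 1, 9, 8, 3]), ("INFORMATION", [7, 7, 4, 7, 8]),
    ("PLEASE", [2, 3, 2, 4, 9]), ("TO", [4, 5, 8, 1, 7]), ("MAKE", [6, 3, 9, 8, 5]),
    ("A", [4, 7, 6, 9, 8]), ("RESERVATION", [3, 6, 7, 8, 9]),
    ("RETURN", [9, 7, 6, 4, 8]), ("THE", [8, 6, 5, 2, 3]), ("MORNING", [3, 4, 5, 6, 7]),
    ("FIRST", [8, 6, 8, 7, 5]), ("CLASS", [5, 5, 4, 3, 9]), ("SEAT", [9, 9, 8, 7, 3]),
    ("NON-STOP", [3, 3, 4, 5, 8]), ("FLIGHT", [9, 8, 3, 5, 6]), ("WILL", [6, 7, 7, 6, 6]),
    ("PAY", [4, 4, 4, 4, 3]), ("IN", [3, 3, 3, 6, 9]), ("CASH", [5, 4, 3, 7, 6]),
    ("HOW", [2, 9, 8, 7, 5]), ("MUCH", [6, 2, 8, 4, 9]), ("NEED", [7, 6, 5, 4, 3]),
]


def transcribe(table, k=5):
    """Nearest-neighbor decision: the closest vocabulary word at each position."""
    return [min(table, key=lambda row: row[1][i])[0] for i in range(k)]


print(" ".join(transcribe(TABLE_I)))   # WOULD SOME IS TO FARE
```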

Following Table II the reader can trace the operation of the algorithm
as it computes the valid sentence having the smallest total distance. First
the $\Phi$ and $\Psi$ arrays are initialized according to eq. (8). To make the figure
easier to read, this has been shown only for i = 0.

Note that there are two transitions from state 1; one to state 2 labeled
I and the other to state 8 labeled HOW. Accordingly $\Phi_1(2)$ is set to 7, the
metric for I; $\Psi_{11}(2)$ is set to 1, the state at the beginning of the transition,
and $\Psi_{21}(2)$ is set to 3, the code for the transition label I. Similarly $\Phi_1(8)$ is
set to 2, the metric for HOW; $\Psi_{11}(8)$ is set to 1 as before and $\Psi_{21}(8)$ is set
to 26, the code for HOW. All other entries remain unchanged.

Table II. Detailed operation of the first three stages of the algorithm.

Table III. Final results of the algorithm.

In the next stage more transitions become possible. Note in particular
that there are two possible transitions from state 2 to state 3. In accordance
with eq. (9), the one labeled NEED is chosen since it results in the
smallest total distance, 13, which is entered in $\Phi_2(3)$; $\Psi_{12}(3)$ is set to 2,
the previous state, and $\Psi_{22}(3)$ is set to 28, the code for NEED. Transitions
to states 7, 9, and 22 are also permissible and thus these columns are filled
in according to the same procedure.

EFFECTS OF SYNTACTIC ANALYSIS 1633 









The completion of this phase of the algorithm results in $\Phi$ and $\Psi$ as
shown in Table III. We can now trace back according to eqs. (13) and (14)
to find $\hat{W}$. From Fig. 2 we see that there are two accepting states, 5 and
6. $\Phi_5(6)$ is 8 which is less than 30, the value of $\Phi_5(5)$, so we start tracing
back from state 6. The optimal state sequence is, in reverse order,
6,11,10,9,8,1. The corresponding word codes are 2,15,1,27,26 which, when
reversed, decode to the sentence: HOW MUCH IS THE FARE.

One final note: from the operation of the algorithm it should be clear
that it is not necessary to retain $\Phi_i(q)$ for $0 \le i \le |\tilde{W}|$. At the ith stage
one needs only $\Phi_{i-1}(q)$ to compute $\Phi_i(q)$. Thus the storage requirements
are nearly halved in the actual implementation.

In closing we should note that this scheme is formally the same as 
(though conceptually different from) the Viterbi 5 algorithm and similar 
to methods used by Baker 6 and stochastic parsing techniques discussed 
in Fu 7 and Paz. 8 The crucial difference is that in the cited references,
estimates of transition probabilities are used whereas in this method
the transitions are deterministic and the probabilities used are only those
conditioned on the input x(t).

III. THE SEMANTIC AND GRAMMATICAL STRUCTURE OF THE 
RECOGNITION TASK 

In this section we shall discuss the application of the abstractions of 
the previous section to a particular speech recognition problem domain. 
The task which was finally selected was that of an airline information 
and reservation system. The choice was made for three reasons. First, 
the problem is difficult enough so that even under some artificial con- 
straints, it is a significant test of the above described techniques. Second, 
previous work by Rosenberg and Itakura 9 which used single words rather 
than sentences composed of isolated words as input was available for 
purposes of comparison. Third, it affords the opportunity to add modes 
of human/machine communication such as speaker verification and voice 
response. 

The semantics of the language we designed limits a user to the fol- 
lowing types of messages. First, one may state the desired kind of 
transaction (i.e., requesting flight information or making a reservation). 
Then, one may make a reservation either by providing all necessary in- 
formation in one sentence or by giving answers to questions as required. 
The user may select arrival and/or departure dates, times and cities, 
number of stops, number and class of seats, specific flight numbers and 
aircraft types. Alternatively, the user may ask questions about arrival 
and/or departure dates, times and locations of specific flights, the type 
of aircraft, the number of stops, the fare, the number of meals served and 
the flight time. Finally, he may request a repeat of any information or 
supply telephone numbers and method of payment. 

For most of the above messages there are several acceptable grammatical
structures of sentences conveying the same semantic information.

In order to limit the complexity of the syntactic analysis, certain ar- 
bitrary constraints were imposed: 

(i) Dates consist of two digits followed by the name of the month.

(ii) Flight numbers are limited to one or two digits.

(iii) The vocabulary includes the names of only ten cities.

(iv) The length of the longest sentence is 22 words.

It was felt that these constraints could all be relaxed if so desired without
making major system modifications.

Next we give an informal specification of the syntactic structure. In 
this description phrases enclosed in curly brackets are alternatives. 
Those enclosed in square brackets are optional, while those within angle 
brackets represent a class of words of the indicated type. 



I {WANT | WOULD LIKE} {SOME INFORMATION | TO MAKE A RESERVATION} [PLEASE]

I {WANT | WOULD LIKE} TO {GO | LEAVE | RETURN | DEPART} [FROM <city>] [TO <city>]
    [ON] [<day>] {MORNING | AFTERNOON | EVENING | NIGHT} [THE <date>]



{ {AT WHAT TIME | WHEN} DO FLIGHTS LEAVE [<city>] FOR <city>
| HOW MANY FLIGHTS {ARE THERE | GO} [FROM <city>] TO <city> }
    ON [<day>] {MORNING | AFTERNOON | EVENING | NIGHT} [THE <date>]

WHAT PLANE IS ON FLIGHT <flight number> [TO <city>]

{WHAT | HOW MUCH} IS THE FARE [FROM <city>] [TO <city>]

IS A MEAL SERVED ON [THE] FLIGHT [<flight number>] [TO <city>]

I {WANT | WOULD LIKE | WILL TAKE} FLIGHT [NUMBER] <flight number> [TO <city>]
    ON <day> {MORNING | AFTERNOON | NIGHT | EVENING} [THE <date>]

I {WANT | NEED | WOULD LIKE} <digit> [{FIRST CLASS | COACH}] SEAT[S]

I PREFER THE [<manufacturer>] <aircraft type>

I {WANT | WOULD LIKE} TO GO AT <hour> {A.M. | P.M. | O'CLOCK}

PLEASE REPEAT THE {<flight number> | FARE | {ARRIVAL | DEPARTURE} TIME[S] | PLANE
    | NUMBER OF MEALS | FLIGHTS}

{AT WHAT TIME | WHEN} DOES FLIGHT <flight number> {ARRIVE | DEPART} [{FROM | TO} <city>]

I WILL PAY BY {CASH | AMERICAN EXPRESS | DINERS CLUB | MASTER CHARGE}

MY {HOME | OFFICE} PHONE [NUMBER] IS [AREA CODE] [<area code number>] <phone number>

I {WANT | WOULD LIKE} A NON-STOP FLIGHT

HOW MANY STOPS ARE THERE ON FLIGHT <flight number> [FROM <city>] TO <city>
    ON <day> {MORNING | AFTERNOON | EVENING | NIGHT} [THE <date>]

WHAT IS THE FLIGHT TIME [FROM <city>] [TO <city>]

The above description is too informal to define every detail of the
Flight Information Language. It should, however, give the reader a
feeling for the basic syntax and semantics. This specification of the
language is quite useless for the purpose of the syntax analysis algorithm.
For that purpose we have produced a formal specification of the language,
the state transition diagram for which is shown in Fig. 3. From
this graph it may be seen that $|V| = 127$; $|\delta| = 450$; $|Q| = 144$ and
$|Z| = 21$, with the accepting states being designated by asterisks.

IV. DETAILS OF THE SIMULATION 

The simulation of the speech recognition system based upon the 
analysis described above may be treated as three separate problems. 
They are: (i) random generation of many well-formed sentences, (ii)
computing a $[d_{ij}]$ matrix for each in such a way that the number of
acoustic errors resulting from a nearest-neighbor decision rule is controllable,
and (iii) syntactically analyzing the sentences and tabulating
the appropriate statistics automatically. We shall now discuss these in
order.

The method for generating random sentences in the language is just
the following algorithm:

(i) $W \leftarrow \lambda$; $q \leftarrow q_1$ (W gets the null string)

at the ith stage,

(ii) Choose a word, $w_{k_i} \in V$

(iii) If $\delta(q, w_{k_i}) \ne q_j$ for every j, go to (ii);
else $W \leftarrow W w_{k_i}$ (W gets itself concatenated with $w_{k_i}$)
and $q \leftarrow q_j$

(iv) If $q \in Z$ and $\rho < \theta$, where $\rho$ is a pseudorandom number and
$\theta$ is some threshold, STOP; else go to (ii)

When the procedure terminates, $W = w_{k_1} w_{k_2} \cdots w_{k_l}$; $l \le l_{\max}$. Obviously
by changing the threshold, $\theta$, one can vary the average length of the
sentences produced. In the actual simulation, $\rho$ was uniformly distributed
on $(-1/2, 1/2)$ and $\theta$ was set to -0.25, producing an average sentence
length of 10.3 words.
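
Steps (i) through (iv) can be sketched in Python as follows. The toy diagram below is hypothetical and much smaller than Fig. 3; the function name random_sentence and its parameters are assumptions. One guard has been added that is not part of the stated procedure: the sketch returns as soon as it reaches an accepting state with no outgoing edges, since otherwise step (iii) would loop indefinitely there.

```python
import random

# Hypothetical toy diagram (not Fig. 3): I {WANT | WOULD LIKE} SOME INFORMATION [PLEASE]
DELTA = {
    (1, "I"): 2, (2, "WANT"): 3, (2, "WOULD"): 4, (4, "LIKE"): 3,
    (3, "SOME"): 5, (5, "INFORMATION"): 6, (6, "PLEASE"): 7,
}
START, ACCEPTING = 1, {6, 7}
VOCABULARY = sorted({word for _q, word in DELTA})


def random_sentence(delta, start, accepting, vocabulary, theta=-0.25, rng=random):
    """Generate one well-formed sentence by random trial of vocabulary words."""
    has_exit = {p for p, _word in delta}       # states with at least one outgoing edge
    sentence, q = [], start                    # step (i): W is the null string
    while True:
        word = rng.choice(vocabulary)          # step (ii)
        if (q, word) not in delta:             # step (iii): no such edge, try again
            continue
        sentence.append(word)
        q = delta[(q, word)]
        # step (iv): rho uniform on (-1/2, 1/2), stop at an accepting state if rho < theta
        if q in accepting and rng.uniform(-0.5, 0.5) < theta:
            return sentence
        if q not in has_exit:                  # guard added for this sketch
            if q in accepting:
                return sentence
            raise ValueError("reached a non-accepting state with no outgoing edges")


rng = random.Random(0)
for _ in range(3):
    print(" ".join(random_sentence(DELTA, START, ACCEPTING, VOCABULARY, rng=rng)))
```

With $\theta = -0.25$ the stopping test in step (iv) succeeds with probability 1/4 each time an accepting state is entered, which is how the threshold controls the average sentence length.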

In all, 42,000 such sentences were generated. In addition we made up
171 sentences averaging 9.7 words in length to use as a
check against the randomly generated sentences. The additional sentences
comprised several realistic transactions between airline customers
and an automated flight information and reservation system.

Fig. 3. State transition diagram of the Flight Information and Reservation Language.

1638 THE BELL SYSTEM TECHNICAL JOURNAL, MAY-JUNE 1978

The procedure for generating a $[d_{ij}]$ matrix simulating one which
might be produced by acoustic recognition for the randomly selected
sentence $W = w_1 w_2 \cdots w_k$ is as follows. Corresponding to each $v_j \in V$
we assign the five-dimensional Gaussian density function

$$p_j(x) = (2\pi)^{-5/2}\, |U|^{-1/2}\, e^{-\frac{1}{2}(x - m_j)\, U^{-1} (x - m_j)^T} \tag{15}$$

where the covariance matrix, U, was chosen, for convenience, to be

$$U = \mathrm{diag}(\sigma^2, \sigma^2, \sigma^2, \sigma^2, \sigma^2) \tag{16}$$


for selected values of $\sigma^2$. The mean vectors

$$m_j = (m_{1j}, m_{2j}, m_{3j}, m_{4j}, m_{5j})$$

were fixed by selecting the $m_{lj}$ at random from

$$m_{lj} = \begin{cases} 2 \\ 1 \\ 0 \end{cases} \qquad \text{for } 1 \le l \le 5;\; 1 \le j \le |V| \tag{17}$$

until the 127 mean vectors were defined. Then for $w_i = v_j$, a random
vector y was drawn from $p_j(x)$ and distances were computed according
to:

$$d_{ij} = \| y - m_j \| \qquad \text{for } 1 \le j \le |V| \tag{18}$$

where the norm is the simple Euclidean distance. Equation (18) was
evaluated for $1 \le i \le k$; thus all entries in the $[d_{ij}]$ matrix were computed
for each sentence W.

The acoustic recognition was simulated by a nearest neighbor rule so
that $\tilde{w}_i = v_j$ if

$$d_{ij} \le d_{in} \qquad \text{for } 1 \le n \le |V| \tag{19}$$

Again, eq. (19) was applied for $1 \le i \le k$ and ties were arbitrarily broken.
Clearly by changing the value of $\sigma^2$ in eqs. (15) and (16) the simulated
acoustic error rate can be varied, with small values of $\sigma^2$ producing low
error rates.
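
The simulated acoustic recognizer can be sketched as below. Several details are assumptions made for the illustration and are not confirmed by the text: the component values of the mean vectors are drawn from (0, 1, 2), the mean vectors are forced to be distinct, and the function names are hypothetical. Only the Python standard library is used.

```python
import math
import random


def make_means(n_words, rng, dim=5, levels=(0, 1, 2)):
    """Draw n_words distinct mean vectors with components chosen from `levels`.

    Both the set of levels and the distinctness requirement are assumptions.
    """
    means = set()
    while len(means) < n_words:
        means.add(tuple(rng.choice(levels) for _ in range(dim)))
    return sorted(means)


def distance_row(j, means, sigma, rng):
    """Draw y from the Gaussian centered on word j and return its distance to every mean."""
    y = [rng.gauss(m, sigma) for m in means[j]]   # independent components, variance sigma^2
    return [math.dist(y, m) for m in means]       # Euclidean norm, as in eq. (18)


def simulate(word_indices, means, sigma, rng):
    """Return the distance matrix and the nearest-neighbor transcription of a sentence."""
    d = [distance_row(j, means, sigma, rng) for j in word_indices]
    heard = [min(range(len(means)), key=row.__getitem__) for row in d]   # eq. (19)
    return d, heard


rng = random.Random(1)
means = make_means(127, rng)
truth = [rng.randrange(127) for _ in range(1000)]
for sigma in (0.05, 0.245, 1.0):
    _d, heard = simulate(truth, means, sigma, rng)
    errors = sum(h != t for h, t in zip(heard, truth))
    print(sigma, errors / len(truth))
```

The printed error rates depend on the assumed mean vectors and on the random seed, so they should not be expected to match the figures reported in this paper; the sketch only shows that the word error rate grows with $\sigma$.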

Given the foregoing discussion, description of the simulation is quite
simple. A set of 1000 random sentences was generated and its distance
matrices computed. The sentences were syntactically analyzed and errors
counted. This was done for $0.05 \le \sigma \le 2.1$ with $\sigma$ being incremented by
0.05 for each set of 1000 sentences.

In addition, the specially formulated 171 sentences were typed in and
processed. In this case $\sigma$ was fixed at a value of 0.245 which resulted in
an acoustic error rate of 11 percent which is close to the value observed
by Rosenberg and Itakura 9 for a similar word recognition task. Error
rates were measured for these sentences as well.

Fig. 4. Error rates as a function of $\sigma$.

V. SIMULATION RESULTS 

The overall results of the simulation are encouraging, showing con- 
siderable improvement in word and sentence recognition accuracy. Given 
an acoustic error rate of 10 percent on 1000 randomly generated sen- 
tences averaging 10.3 words per sentence, syntactic analysis reduces the 
word error rate to 0.2 percent resulting in a sentence error rate of 1 
percent. A sentence is in error if even one word is improperly classified. 
For the test set of 171 sentences containing 1662 words, an 11 percent 
acoustic word error rate was lowered to 0.2 percent after syntactic 
analysis resulting in a 1.2 percent sentence error rate. The actual time 
required to analyze a 22-word sentence on the Data General Nova 840 
is a small fraction of a second. 
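
As a rough plausibility check on these figures, one can ask what sentence error rate would follow if word errors were statistically independent, an assumption the paper does not make. Under that assumption a sentence of n words with word error rate p is in error with probability $1 - (1 - p)^n$. The function name below is hypothetical.

```python
def sentence_error_rate(word_error_rate, words_per_sentence):
    """Probability that at least one word is wrong, assuming independent word errors."""
    return 1.0 - (1.0 - word_error_rate) ** words_per_sentence


acoustic = sentence_error_rate(0.10, 10.3)    # before syntactic analysis
syntactic = sentence_error_rate(0.002, 10.3)  # after syntactic analysis
print(round(acoustic, 3), round(syntactic, 4))
```

Under independence, a 10 percent word error rate would leave about two thirds of the sentences in error, while a 0.2 percent word error rate would give about 2 percent. The reported sentence error rate of 1 percent is lower than this estimate, which would be consistent with the residual word errors tending to occur together in the same sentences.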

Details of the results are best given in the accompanying figures.
Figure 4 is a plot of acoustic and syntactic word error probabilities as a
function of $\sigma$ of eqs. (15) and (16). Each point marked with a symbol is
an observed data point based on the results obtained from 1000 sentences.
It should be noted that the results for each point are based upon
different sentences. The solid lines are obtained from a nonlinear
least-squares fit of the data to the equation

$$P_e = \frac{a_1 a_2}{a_3 a_2 + (a_1 - a_3 a_2)\, e^{-a_1 \sigma}} \tag{20}$$

The method used in fitting the data is described by McCalla. 11 The curve
of eq. (20) is called a logistic curve; its significance is discussed in detail
by Braun. 12 For the simulated data the standard deviation of the actual
data from the fitted curve was <0.005.

Figure 5 shows the word and sentence error probabilities as a function
of the acoustic word error probability. Once again the marked points are
derived from sets of 1000 randomly generated sentences while the solid
curves are obtained by fitting the data to an exponential curve of the
form

$$P_{ep} = a_1 e^{a_2 P_e} \tag{21}$$

Once again the resulting fits were good, having a standard deviation
of <0.004.

VI. CONCLUSIONS 

It is clear from the results of the simulation that the syntax analysis 
algorithm is very fast and effective in eliminating word recognition errors 
which occur at the acoustic level. It is expected that the performance of 
the algorithm for real speech input will depend on the characteristics 
of the acoustic recognizer and the task language. However, we believe 
that our results are indicative of the performance which can be attained 
in real speech recognition systems. 

There are several areas for further research which are immediately 
suggested by this work. Although this entire presentation has been oriented
toward the grammatical structure of sentences, the method described
is certainly not restricted to that area. For example, the phonemic
structure of words can be specified by a formal language as can the
composition of acoustic features into phonemes. We therefore feel that
optimal syntax analysis methods will be useful in more difficult speech
recognition tasks than the one described here.

Another useful extension of the technique would be achieved by retaining
the same optimality criterion while relaxing the restriction that
$|\hat{W}| = |\tilde{W}|$. In other words, the algorithm would be allowed to insert and
delete words. This could be an important aid in the solution of the
segmentation problem in continuous speech.

On the theoretical side, it would be enlightening to derive analytical 
expressions for the average probability of error for the syntax analyzer, 
given the properties of the language and a characterization of the acoustic 
recognizer. Perhaps for this purpose the entropy or redundancy of the 
language might be sufficient, while the acoustic recognizer might be 
viewed as a noisy channel and characterized by its equivocation or ca- 
pacity. In any event, it seems obvious that an information theoretic 
analysis would provide insights into the behavior of speech recognition 
systems. 

Finally we note that Regular languages are the most simple syntactic 
structures. One naturally wonders whether efficient, optimal methods 
exist for formal languages of much greater complexity which would be 
better models of Natural Language. 

In summary, we may say that optimal syntactic analysis techniques 
are useful and powerful tools to be used in tractable speech recognition 
tasks as well as being interesting mathematical objects. 